ABSTRACT

Sequencing the genome of the crop Solanum lycopersicum will also help to identify beneficial genes in other
plant relatives of the tomato, such as potato and pepper. All of these crops are members of the Solanaceae, or
nightshade, family, one of the world's most important vegetable plant families in terms of both economic value and
production volume. Developing better tomatoes will also contribute to the quest for global food security, and the new
genome information can be used to develop a wide variety of beneficial traits. The Tomato Genomic Resources Database
(TGRD) is an online, interactive relational database developed using open-source software. The user-friendly
interface for TGRD has been developed using JavaScript and HTML to query and retrieve data based on user needs.
Sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify functional, structural, or
evolutionary relationships between them. If two sequences in an alignment share a common ancestor, mismatches
can be interpreted as point mutations. FASTA format is a text-based format for representing either nucleotide or
peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. Sequence homology
is a general term that indicates evolutionary relatedness among sequences, whereas sequence similarity refers to
substitutions between residues with similar chemical properties. NCBI provides a common data extraction platform
for sequence analysis. The ClustalW coloured alignment also offers a colour option in the output results; residues
are coloured according to their physicochemical properties (red, blue, green, magenta, and grey). In addition to
maintaining the GenBank nucleic acid sequence database, NCBI provides data retrieval systems and computational
resources for the analysis of GenBank data and a variety of other biological data made available through NCBI.
INTRODUCTION

The aim of tomato genome sequencing is to reveal and explore the genetic variation available in tomato.
Tomato has been selected as a target crop because it is economically one of the most important crop species.
ClustalW can be run online from the EBI web server, and the source code and executables for Windows and Linux are
available from EBI. The Clustal series of programs are widely used in molecular biology for the multiple alignment
of both nucleic acid and protein sequences and for preparing phylogenetic trees (Taylor Willie, Higgins Des
2000, Bioinformatics). New features include NEXUS and FASTA format output, printing of range numbers, and faster
tree calculation. ClustalW was originally developed to run on local computers, but numerous web servers have been
set up, notably at the EBI (European Bioinformatics Institute). Tomato has been used extensively for genetic studies
for several reasons, such as its diploid genome, short generation time, and efficient transformation technology.
The data can be submitted and accessed via the World Wide Web (Mount, David 2004). The Tomato Genomic Resources
Database is an interactive relational database developed using open-source bioinformatics software. Sequence analysis
has had a huge impact on Solanaceae research, with pairwise alignment used to find the best matches to query
sequences. FASTA format is a text-based format for representing either nucleotide or protein sequences (Higgins, D. G.;
Sharp, P. M., 1989). The format originates from the FASTA software package. DNA and protein sequences are represented
in the one-letter IUPAC nucleotide and amino acid codes. The FASTA program finds local similarity between sequences
and calculates the statistical significance of matches; a mismatch is shown as a space. ClustalW is a widely used
computer program for multiple sequence alignment (Higgins, D. G.; Bleasby, A. J.; Fuchs, R., 1992). By default, an
alignment displays symbols denoting the degree of conservation observed in each column. FASTA produces a local
alignment score for the comparison of the query sequence with every sequence in the database (Thompson, J. D.;
Gibson, T. J.; Plewniak, F.; Jeanmougin, F.; Higgins, D. G., 1997). Sequence alignment, or sequence comparison, lies
at the heart of bioinformatics; it describes ways of arranging DNA and RNA sequences to identify regions of
similarity among them.
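
As an illustration of the FASTA format described above, the following minimal Python sketch reads FASTA-formatted text into a mapping from description line to sequence. The function name parse_fasta and the two example records are hypothetical and were invented for this illustration; they are not taken from the study's data.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {description: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The text after '>' is the description line of a new record.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[header] += line.upper()
    return records


# Hypothetical two-record example (sequences invented for illustration).
example = """>seq1 hypothetical nucleotide record
ATGGCTGCT
TCTACA
>seq2 hypothetical peptide record
MAASTMALSS
"""

records = parse_fasta(example.splitlines())
for description, sequence in records.items():
    print(description, len(sequence))
```

Each record keeps its full description line as the key, so wrapped sequence lines are joined into one string per record.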

MATERIALS  AND  METHODS 

The National Center for Biotechnology Information (NCBI) is a multidisciplinary research group that serves as a
resource for molecular biology information. It develops new methods to deal with the volume and complexity of data,
researches methods that can analyze the structure and function of macromolecules, and creates computerized systems
for storing and analyzing data. The primary database retrieval system at NCBI links together several databases,
including GenBank. FASTA is available as part of a package of programs that construct local and global sequence
alignments. The package also includes related programs for identifying related DNA/RNA sequences and for evaluating
the statistical significance of sequence similarities.

Databases and Corresponding Web Services

Database name    Web services type               URL
NCBI             E-Utility web services          http://www.ncbi.nlm.nih.gov
FASTA            -                               www.ebi.ac.uk/tools
Clustal Omega    -                               http://www.ebi.ac.uk/Tools/msa/clustalw2/
EMBL/EBI         EMBL-EBI web services           http://www.ebi.ac.uk/tools/
UniProtKB        Programmatic access services    http://www.uniprot.org
EBI FTP site     -                               ftp://ftp.ebi.ac.uk/pub/software/clustalw2/
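
Programmatic retrieval through the NCBI E-Utility web services can be sketched as follows. This is a minimal example under stated assumptions: the base address and the efetch parameters (db, id, rettype, retmode) follow the public E-utilities documentation and do not appear in this paper, and the helper name efetch_fasta_url is hypothetical. The accession AH001374.2 is the Cab-3B record used in this study. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumed base address of the NCBI E-utilities (not given in this paper).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_fasta_url(accession: str, db: str = "nuccore") -> str:
    """Build an efetch URL that requests one record in FASTA format."""
    params = {
        "db": db,            # database to query (nucleotide records)
        "id": accession,     # accession or identifier of the record
        "rettype": "fasta",  # ask for FASTA-formatted output
        "retmode": "text",   # plain-text response
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


url = efetch_fasta_url("AH001374.2")
print(url)
```

The resulting URL could then be opened with urllib.request.urlopen, subject to NCBI's usage guidelines for automated requests.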

RESULTS  AND  DISCUSSIONS 

The FASTA file format is now widely used by other sequence database search tools, which take nucleotide or protein
sequences as input. ClustalW is a widely used multiple sequence alignment program that can also manipulate existing
alignments, perform profile analysis, and create phylogenetic trees. Alignment can be done by two methods:
slow/accurate and fast/approximate. Clustal Omega is a new multiple sequence alignment program that uses profile
techniques to generate alignments between two or more sequences. Local sequence alignment programs report alignment
scores for the alignments constructed, and related (homologous) sequences will have higher alignment scores. The
statistical significance of an alignment score is more widely accepted as a metric of the relatedness of the two
sequences being aligned. The ClustalW and ClustalX multiple sequence alignment programs have been completely
rewritten in C++ (Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD, 2003). This facilitates
the further development of the alignment algorithms in the future and has allowed proper porting of the programs to
the latest versions of the Linux and Windows operating systems. Availability: the program can be run online from the
EBI web server, http://www.ebi.ac.uk/tools/clustalw2.

The Clustal series of programs are widely used in molecular biology for the multiple alignment of both nucleic acid
and protein sequences and for preparing phylogenetic trees. Clustal was originally developed to run on local
computers, but numerous web servers have been set up, notably at the EBI (European Bioinformatics Institute).
ClustalW improves the sensitivity of progressive multiple sequence alignment through sequence weighting and
position-specific gap penalties. ClustalW is best used as a data exploration tool rather than as a definitive
analysis method. Multiple sequence alignment is now one of the most widely used bioinformatics analyses. ClustalX 2.0
is the new version of the ClustalX graphical alignment tool.
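
The conservation line that Clustal-style output prints beneath an alignment can be illustrated with the minimal Python sketch below. It marks only fully conserved columns with '*', which is how nucleotide alignments are annotated; the additional ':' and '.' symbols used for similar residues in protein alignments are deliberately omitted. The function name conservation_line and the three aligned sequences are hypothetical and were invented for this illustration.

```python
from typing import List


def conservation_line(aligned: List[str]) -> str:
    """Return '*' under every fully conserved column and ' ' elsewhere."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        residues = set(column)
        # A column is conserved only if it holds one residue and no gap.
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Hypothetical aligned nucleotide sequences ('-' denotes a gap).
alignment = [
    "ATG-CTGA",
    "ATGACTGA",
    "ATGTCTCA",
]
line = conservation_line(alignment)
for seq in alignment:
    print(seq)
print(line)
```

Columns that contain a gap or more than one residue are left blank, so runs of '*' show the conserved regions of the alignment.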

>Sequence 1 >gi|1015606183|gb|AH001374.2| Solanum lycopersicum chlorophyll a/b-binding protein Cab-3B
genes, partial cds

>Sequence 2 >gi|1015606182|gb|AH001373.2| Solanum lycopersicum chlorophyll a/b-binding protein Cab-3A
genes, partial cds

Pairwise Statistical Significance Estimation

The pairwise statistical significance is obtained as a function of the following inputs: sequence 1 and
sequence 2, the two sequences being aligned; sc, the scoring scheme (substitution matrix and gap penalties);
and N, the number of shuffles.
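
A shuffle-based estimate of this kind can be sketched in Python as shown below. This is an illustration under stated assumptions, not the procedure used in this study: the scoring scheme is reduced to a simple match/mismatch score with a linear gap penalty instead of a substitution matrix, and the significance is reported as an empirical p-value (the fraction of shuffles that score at least as well as the real pair), whereas published methods commonly fit a statistical distribution to the shuffle scores. All function names, parameter defaults, and the example sequence are hypothetical.

```python
import random
from typing import Optional


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def pairwise_significance(seq1: str, seq2: str, n_shuffles: int = 200,
                          seed: Optional[int] = 0) -> float:
    """Empirical p-value of the seq1/seq2 score against shuffles of seq2."""
    rng = random.Random(seed)
    observed = local_alignment_score(seq1, seq2)
    letters = list(seq2)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # shuffling keeps the composition of seq2
        if local_alignment_score(seq1, "".join(letters)) >= observed:
            at_least += 1
    # The add-one correction keeps the estimate strictly positive.
    return (at_least + 1) / (n_shuffles + 1)


# Hypothetical sequence, invented for illustration.
query = "ACGTTGCAAGCTTAGGCTAACGTA"
p_value = pairwise_significance(query, query, n_shuffles=200)
print(p_value)
```

A small p-value indicates that the observed alignment score is rarely matched by shuffled versions of the second sequence, which is the sense in which the score is statistically significant.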

CONCLUSIONS 

ClustalW is very useful in predicting the function and structure of proteins and DNA and in identifying new members
of protein families. By default, an alignment displays symbols denoting the degree of conservation observed in each
column. Evolutionary relationships can be viewed as a cladogram or a phylogram.